SLAC Parallel Tracking Code Development and Applications
Abstract
The increase in single processor speed based on Moore's law alone will not be able to deliver the dramatic speedup needed in many beam tracking simulations to uncover very slowly evolving effects in a reasonable time. SLAC has embarked on an effort to bring the power of parallel computing to bear on such computations, with the goal of reducing the turnaround time by orders of magnitude so that the results may impact present facilities and future machine designs. This poster will describe the approaches adopted for parallelizing the LIAR code and the ION_MAD code. The scalability of these tracking codes and their further improvement will be discussed.

1 PIPELINE MODEL FOR PARALLELISM

Beam tracking applications model a beam (composed of many bunches of particles) travelling through an accelerator (defined as a lattice of optical elements). Each bunch travels past each optical element in order, and after all the preceding bunches. The nature of these applications imposes an order on the calculations. This limits the possible approaches to parallelism, which seeks to exploit independent calculations that can occur simultaneously.

One successful approach to parallelism for these codes is the pipeline model. The key observation is that bunch b_i can be processed by element e_i while, at the same time, bunch b_{i+1} is processed by element e_{i-1}. The strategy for pipeline parallelism can be summarised as follows:

• Bunches are grouped into P groups of equal size, where P is the number of processes.
• Each process stores and computes with one bunch group and one or more optical elements at a time.
• When process p is done working on its bunch group, it sends it to process p+1 and simultaneously receives a bunch group from process p-1.
• When a process is done working on the last bunch group in the train, it shifts to the next unvisited optical element.

This strategy can also be modified to communicate the optical elements instead of the beam data; either choice fits the pipeline model, and the decision is made in favour of whichever approach reduces the overall communication cost. For simplicity, we will assume for the rest of this paper that the beam data is communicated instead of the optical element data.

As a small example, assume there are three processors and nine lattice elements. The beam is evenly divided into three bunch groups, and bunches within a group are communicated together. The lattice elements are distributed cyclically among the three processors. Figure 1 graphically illustrates the pipeline process. During the first pipeline step, process 1 computes the effect of optical element 1 on bunch group 1; the other processors are idle. During the second pipeline step, bunch group 2 enters the simulation at process 1, and bunch group 1 is sent to process 2, so two processors are now computing. The third bunch group enters the simulation at step 3. Now all the processors are working, and the pipeline is full. The pipeline remains full until the first bunch group reaches the last optical element. The bunch groups then begin to leave the simulation, and processors become idle again.

The pipeline model cannot provide perfect linear speedup, due in part to the filling and emptying steps. However, the pipeline speedup can approach the number of processors when the number of stages is large. In the example, it takes 11 pipeline steps to push three bunch groups through nine lattice elements. In serial, this would have taken 27 steps.
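The schedule just described can be sketched in a few lines of code. The following is a toy illustration of the pipeline model, not code from LIAR or ION_MAD; the function name pipeline_schedule and its structure are invented here for illustration. It reproduces the three-processor, nine-element example and counts the pipeline steps.

```python
# Toy sketch of the pipeline schedule described in the text; not the actual
# LIAR or ION_MAD implementation.  Bunch group g (1-based) reaches lattice
# element e at pipeline step e + g - 1, and elements are distributed
# cyclically over the processes.

def pipeline_schedule(n_elements, n_groups, n_procs):
    """Return the (step, process, element, group) work items and the
    total number of pipeline steps."""
    schedule = []
    total_steps = 0
    for g in range(1, n_groups + 1):
        for e in range(1, n_elements + 1):
            step = e + g - 1                 # when this group meets this element
            proc = (e - 1) % n_procs + 1     # cyclic element distribution
            schedule.append((step, proc, e, g))
            total_steps = max(total_steps, step)
    return schedule, total_steps

if __name__ == "__main__":
    schedule, steps = pipeline_schedule(n_elements=9, n_groups=3, n_procs=3)
    serial = 9 * 3
    print(f"pipeline steps = {steps}, serial steps = {serial}, "
          f"speedup = {serial / steps:.2f}")   # 11 steps, 27 steps, ~2.45
    # The first three pipeline steps, showing the pipeline filling up:
    for item in sorted(schedule):
        if item[0] <= 3:
            print("step %d: process %d applies element %d to bunch group %d" % item)
```

Running the sketch prints one work item at step 1, two at step 2, and three at step 3, matching the filling behaviour described above, and confirms the 11-step versus 27-step count of the example.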
[Figure 1: Pipeline example. Pipeline step (1–11) versus lattice element / pipeline stage (1–9), showing bunch groups 1–3 advancing across processes 1–3.]

The speedup due to the pipeline in this example is therefore 27/11 ≈ 2.5.
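The example generalizes. As a rough sketch, assuming each pipeline step applies exactly one optical element to one bunch group and that the number of bunch groups does not exceed the number of processes, the symbols G (number of bunch groups) and E (number of lattice elements), introduced here for illustration only, give

```latex
% Hedged generalization of the worked example (G bunch groups, E lattice
% elements, G <= P processes); these formulas are inferred from the example,
% not quoted from the paper.
\[
  T_{\text{pipeline}} = E + G - 1, \qquad
  S = \frac{T_{\text{serial}}}{T_{\text{pipeline}}} = \frac{G\,E}{E + G - 1} .
\]
% For G = 3, E = 9 this gives T = 11 steps and S = 27/11 ~ 2.5; as E grows,
% S approaches G (= P), matching the observation that the pipeline speedup
% approaches the number of processors when the number of stages is large.
```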